A class of non homogeneous self interacting random processes with applications to Learning in Games and Vertex-Reinforced Random Walks*

Michel Benaïm

Université de Neuchâtel, Switzerland

Olivier Raimond

Université Paris Sud, France

June 5, 2008 



Abstract 

Using an approximation by a set-valued dynamical system, this paper studies a class of non Markovian and non homogeneous stochastic processes on a finite state space. It provides a unified approach to simulated annealing type processes. It also allows the study of new models of vertex reinforced random walks and new models of learning in games, including Markovian fictitious play.

1 Introduction

Let $E$ be a finite set called the state space, $\mathcal{M} = \mathcal{M}(E)$ the set of Markov matrices over $E$, and $\Sigma$ a compact convex subset of a Euclidean space called the observation space. The set $\Sigma$ will be equipped with the distance induced by the Euclidean norm $\|\cdot\|$ on the observation space. Let $(\Omega, \mathcal{F}, P)$ be a probability space equipped with an increasing sequence of sub $\sigma$-fields $\{\mathcal{F}_n, n \in \mathbb{N}\}$: $\mathcal{F}_n \subset \mathcal{F}_{n+1} \subset \mathcal{F}$.

*We acknowledge financial support from the Swiss National Science Foundation grant 200020-112316.

Our main object of interest is a discrete time random process $(X, M, V) = ((X_n, M_n, V_n))$ defined on $(\Omega, \mathcal{F}, P)$, taking values in $E \times \mathcal{M}(E) \times \Sigma$, such that:

(i) $(X, M, V)$ is adapted (to $\{\mathcal{F}_n, n \in \mathbb{N}\}$), meaning that $(X_n, M_n, V_n)$ is $\mathcal{F}_n$-measurable for each $n$.

(ii) For all $y \in E$

$$P(X_{n+1} = y \mid \mathcal{F}_n) = M_n(X_n, y). \quad (1)$$

We refer to $X_n$ (respectively $V_n$) as the state (respectively, the observation) variable at time $n$, and to the sequence $(M_n)$ as the strategy. We let

$$v_n = \frac{1}{n} \sum_{i=1}^{n} V_i$$

denote the empirical average up to time $n$ of the sequence of observations.
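The empirical average satisfies the recursion $v_{n+1} = v_n + (V_{n+1} - v_n)/(n+1)$, which is the stochastic approximation form exploited later. The following Python sketch illustrates this recursion; the function names `update_average` and `empirical_averages` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def update_average(v_n, V_next, n):
    """Return v_{n+1} = v_n + (V_{n+1} - v_n) / (n + 1)."""
    v_n = np.asarray(v_n, dtype=float)
    V_next = np.asarray(V_next, dtype=float)
    return v_n + (V_next - v_n) / (n + 1)

def empirical_averages(observations):
    """Return [v_1, ..., v_N] for a list of observation vectors V_1, ..., V_N."""
    v = np.zeros_like(np.asarray(observations[0], dtype=float))
    out = []
    # n counts the observations already averaged; n = 0 gives v_1 = V_1.
    for n, V in enumerate(observations):
        v = update_average(v, V, n)
        out.append(v)
    return out
```

With step size $1/(n+1)$ the increments of $(v_n)$ shrink like $1/n$, which is why the limit set of $(v_n)$ can be compared with the trajectories of a continuous time system.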
A well studied situation is when

$$M_n = K(v_n) \quad (2)$$

where $K$ maps probability vectors continuously to irreducible Markov matrices and

$$V_{n+1} = H(X_{n+1}, v_n)$$

for some map $H : E \times \Sigma \to \Sigma$. In such a case $(X_n)$ is called a "Markov chain controlled" by $(v_n)$ and the behavior of $(v_n)$ can be analyzed through the ODE

$$\dot{v} = -v + \sum_x \pi(v)(x) H(x, v) \quad (3)$$

where $\pi(v)$ is the invariant probability of $K(v)$. This approach to controlled Markov chains goes back to the work of Métivier and Priouret (1987) (see also the books by Benveniste, Métivier and Priouret (1990) and Duflo (1996)), strongly influenced by the pioneering works of Ljung (1977) and Kushner and Clark (1978) on the ODE method. It has been used in Benaïm (1997) for analyzing certain vertex reinforced random walks on finite graphs.

The main purpose of this paper is to investigate the long term behavior of $(v_n)$ under less stringent assumptions than (2). In particular we are interested in situations where:

(a) $M_n$ may depend on other (non-observable or hidden) variables than $v_n$; and

(b) the closure of $\{M_n : n \ge 0\}$ may contain degenerate (i.e. non irreducible) Markov matrices.

Situation (a) typically occurs in game theory where players may have only 
partial information on the actions played by their opponents, and (b) is 
motivated by stochastic optimization algorithms. 

Relying on a recent paper by Benaïm, Hofbauer and Sorin (2005), it will be shown that under certain assumptions (involving estimates on the log-Sobolev and spectral gap constants of $(M_n)$) the asymptotic behavior can be described in terms of a certain set-valued deterministic dynamical system that generalizes the ODE (3). Applications to non-homogeneous Markov chains, vertex reinforced random walks and learning processes in game theory will be given.

Outline of contents 

The organization of the paper is as follows. Section 2 states the notation, hypotheses and the main result. Our main assumption (Hypothesis 2.1) is somewhat abstract, and more tractable conditions (expressed in terms of spectral gaps and log-Sobolev constants) are given in section 3. Section 4 is devoted to examples and applications. The proof of the main result is postponed to section 5.

2 Notation, hypotheses and main results 

A probability vector (or measure) over $E$ is a map $\mu : E \to \mathbb{R}_+$ such that $\sum_x \mu(x) = 1$, and a Markov matrix is a map $M : E \times E \to \mathbb{R}_+$ such that

$$\forall x \in E, \quad \sum_y M(x, y) = 1.$$

We let $\Delta = \Delta(E)$ denote the space of probability vectors over $E$ and $\mathcal{M} = \mathcal{M}(E)$ denote the set of Markov matrices on $E$.

Given a function $f : E \to \mathbb{R}$ and $\mu \in \Delta$ we use the notation

$$\mu f = \mu(f) = \sum_x \mu(x) f(x).$$

A Markov matrix $M$ on $E$ acts on functions $f$ and measures $\mu$ according to the formulas

$$Mf(x) = \sum_y M(x, y) f(y),$$

$$\mu M(y) = \sum_x \mu(x) M(x, y).$$

We let $M^n$ denote the Markov matrix obtained by matrix multiplication. Equivalently, $M^n f = M(M^{n-1} f)$ for $n \ge 1$, with the convention that $M^0 f = f$.

Points $x, y \in E$ are said to be related if there exist $i, j \ge 0$ (depending on $x$ and $y$) such that $M^i(x, y) > 0$ and $M^j(y, x) > 0$. An equivalence class for this relation is called a recurrent class. The Markov matrix $M$ on $E$ is said to be indecomposable if it has a unique recurrent class (possibly periodic) and irreducible if this recurrent class is $E$.

By standard results, indecomposability of $M$ implies that $M$ possesses a unique invariant probability measure $\pi$ characterized by the relation $\pi M = \pi$. Moreover, the generator $L = -I + M$ has kernel $\mathbb{R}\mathbf{1}$ and its restriction to $\{f : \pi f = 0\}$ is an isomorphism. It then follows that $-L$ admits a pseudo "inverse" $Q$ characterized by

$$Q\mathbf{1} = 0,$$

and

$$Q(I - M) = (I - M)Q = I - \Pi,$$

where $\Pi \in \mathcal{M}$ denotes the matrix defined by $\Pi(x, y) = \pi(y)$. To shorten notation we also call $Q$ the pseudo inverse of $M$.
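For numerical experiments, the pseudo inverse can be computed from the identity $Q = (I - M + \Pi)^{-1} - \Pi$, which is a standard consequence of the two characterizing relations above (it is not stated in the text and is used here as an assumption of the sketch). The helper names `invariant_probability` and `pseudo_inverse` are hypothetical.

```python
import numpy as np

def invariant_probability(M):
    """Solve pi M = pi with sum(pi) = 1 by least squares.

    The solution is unique when M is indecomposable.
    """
    M = np.asarray(M, dtype=float)
    d = M.shape[0]
    # Stack the equations (I - M)^T pi = 0 and 1^T pi = 1.
    A = np.vstack([(np.eye(d) - M).T, np.ones((1, d))])
    b = np.concatenate([np.zeros(d), [1.0]])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def pseudo_inverse(M):
    """Return (Q, pi) with Q 1 = 0 and Q (I - M) = (I - M) Q = I - Pi."""
    M = np.asarray(M, dtype=float)
    d = M.shape[0]
    pi = invariant_probability(M)
    Pi = np.tile(pi, (d, 1))  # Pi(x, y) = pi(y)
    Q = np.linalg.inv(np.eye(d) - M + Pi) - Pi
    return Q, pi
```

The size of $|Q|$ measures how slowly the chain with transition matrix $M$ mixes, which is why Hypothesis 2.1 below controls its growth.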

Given a vector $f$ and a matrix $N$, we set $|f| = \max_x |f(x)|$ and $|N| = \max_{x,y} |N(x, y)|$.

Our main assumption is the following: 

Hypothesis 2.1 The matrices $(M_n)$ are indecomposable and their pseudo inverses $(Q_n)$ and invariant probabilities $(\pi_n)$ satisfy almost surely

(i) $$\lim_{n \to \infty} |Q_n|^2 \frac{\log(n)}{n} = 0,$$

(ii) $$\lim_{n \to \infty} |Q_{n+1} - Q_n| = 0,$$

(iii) $$\lim_{n \to \infty} |\pi_{n+1} - \pi_n| = 0.$$

The verification of hypothesis 2.1 is the subject of section 3, where sufficient and more tractable conditions will be detailed.

Let $\overline{V}_n : E \to \Sigma$ be an $\mathcal{F}_n$-measurable map defined by

$$\overline{V}_n(x) = \frac{E[V_{n+1} 1_{\{X_{n+1} = x\}} \mid \mathcal{F}_n]}{M_n(X_n, x)}$$

for $M_n(X_n, x) \ne 0$. In addition to hypothesis 2.1 we assume that

Hypothesis 2.2

$$\lim_{n \to \infty} M_{n+1} Q_{n+1} (\overline{V}_{n+1} - \overline{V}_n) = 0$$

almost surely.

Remark 2.3 Here are some sufficient conditions ensuring hypothesis 2.2.

(i) Assume that $x \mapsto \overline{V}_{n+1}(x) - \overline{V}_n(x)$ is a constant map. Then hypothesis 2.2 holds since $Q_n \mathbf{1} = 0$. This will be used in section 4.

(ii) More generally, let $T\Sigma$ be the affine hull of $\Sigma$ (the smallest affine space containing $\Sigma$). Assume that for all $n \in \mathbb{N}$ there exist a vector $A_n \in T\Sigma$ and a map $B_n : E \to T\Sigma$ such that

(a) for all $x \in E$, $\overline{V}_{n+1}(x) - \overline{V}_n(x) = A_n + B_n(x)$,

(b) $\limsup_{n \to \infty} |B_n|^2 \frac{n}{\log(n)} < \infty$, almost surely.

Then $|M_{n+1} Q_{n+1} (\overline{V}_{n+1} - \overline{V}_n)| = |M_{n+1} Q_{n+1} B_n| \le |Q_{n+1}| |B_n| \to 0$ almost surely by hypothesis 2.1.

(iii) Assume that $M_n(x, y) = \pi_n(y)$. Then $M_{n+1} Q_{n+1} = 0$ so that hypothesis 2.2 holds.

2.1 Adapted set-valued dynamical systems

The purpose of this section is to introduce certain differential inclusions on $\Sigma$ that will prove to be useful for analyzing the long term behavior of $(v_n)$. Recall that we let $\pi_n$ denote the invariant probability of $M_n$. Let

$$\theta_n = \pi_n \overline{V}_n = \sum_x \pi_n(x) \overline{V}_n(x). \quad (4)$$

We let $C_n \subset \Sigma \times \Sigma$ denote the topological support of the law of $(v_n, \theta_n)$, that is, the smallest closed set $F \subset \Sigma \times \Sigma$ such that

$$P((v_n, \theta_n) \in F) = 1.$$

Let $\mathrm{clos}\{C_n\}$ denote the set of all possible limit points $z = \lim z_{n_k}$ with $z_{n_k} \in C_{n_k}$ and $n_k \to \infty$. It is easily seen that $\mathrm{clos}\{C_n\}$ is a nonempty compact subset of $\Sigma \times \Sigma$.

A nonempty set $G \subset \Sigma \times \Sigma$ is called a graph (or a bundle) over $\Sigma$ if the projection

$$p : G \to \Sigma, \quad (u, v) \mapsto u$$

is onto. A graph $G$ over $\Sigma$ defines a set-valued function mapping each point $u \in \Sigma$ to a set $G(u) = \{v \in \Sigma : (u, v) \in G\}$.

Definition 2.4 A set $C \subset \Sigma \times \Sigma$ is said to be adapted to $\{(v_n, \theta_n)\}$ (or simply adapted) if

(i) $C$ is a closed graph over $\Sigma$.

(ii) For all $u \in \Sigma$, $C(u)$ is a nonempty convex set.

(iii) $\mathrm{clos}\{C_n\} \subset C$.

To an adapted set $C$ we associate the differential inclusion

$$\dot{v} \in -v + C(v). \quad (5)$$

A solution to (5) is an absolutely continuous mapping $v : \mathbb{R} \to \Sigma$ verifying $\dot{v}(t) + v(t) \in C(v(t))$ for almost every $t$. A set $A \subset \Sigma$ is said to be invariant if for all $x \in A$ there exists a solution $\mathbf{x}$ to (5) with $\mathbf{x}(0) = x$ and such that $\mathbf{x}(\mathbb{R}) \subset A$.

Given a set $A \subset \Sigma$ and $(x, y) \in A^2$ we write $x \hookrightarrow_A y$ if for every $\epsilon > 0$ and $T > 0$ there exist an integer $n \in \mathbb{N}$, solutions $\mathbf{x}_1, \ldots, \mathbf{x}_n$ to (5) and real numbers $t_1, t_2, \ldots, t_n$ greater than $T$ such that

(a) $\mathbf{x}_i([0, t_i]) \subset A$,

(b) $\|\mathbf{x}_i(t_i) - \mathbf{x}_{i+1}(0)\| \le \epsilon$ for all $i = 1, \ldots, n - 1$,

(c) $\|\mathbf{x}_1(0) - x\| \le \epsilon$ and $\|\mathbf{x}_n(t_n) - y\| \le \epsilon$.

Definition 2.5 A set $A \subset \Sigma$ is said to be internally chain transitive provided $A$ is compact and $x \hookrightarrow_A y$ for all $x, y \in A$.

It is not hard to verify (see e.g. Benaïm, Hofbauer and Sorin (2005), Lemma 3.5) that an internally chain transitive set is invariant.

The limit set of $(v_n)$ is the set $L = L((v_n))$ consisting of all points $p = \lim v_{n_k}$ for some sequence $n_k \to \infty$. Theorem 2.6 below is the main result of the paper. Its proof relies heavily on Benaïm, Hofbauer and Sorin (2005) and is given in section 5.

Theorem 2.6 Assume that hypotheses 2.1 and 2.2 hold. Let $C$ be an adapted graph. Then the limit set of $(v_n)$ is an internally chain transitive set for the differential inclusion

$$\dot{v} \in -v + C(v).$$

2.2 Background : How to use Theorem 2.6

The notion of "internally chain transitive set" was introduced by Benaim 
and Hirsch (1996) in order to analyze the long term behavior of certain per- 
turbations of flows and has been recently extended to multivalued dynamical 
systems by Benaim, Hofbauer and Sorin (2005). We refer the reader to this 
paper for more details, examples and properties. For convenience this section 
briefly reviews a few useful properties of internally chain transitive sets. 

The differential inclusion (5) induces a set-valued dynamical system $\{\Phi_t\}_{t \in \mathbb{R}}$ defined by

$$\Phi_t(x) = \{\mathbf{x}(t) : \mathbf{x} \text{ is a solution to (5) with } \mathbf{x}(0) = x \in \Sigma\}.$$

A nonempty compact set $A$ is an attracting set if there exist a neighborhood $U$ of $A$ and a function $t$ from $(0, \epsilon_0)$ to $\mathbb{R}_+$ with $\epsilon_0 > 0$ such that

$$\Phi_t(U) \subset A^\epsilon$$

for all $\epsilon < \epsilon_0$ and $t \ge t(\epsilon)$, where $A^\epsilon$ stands for the $\epsilon$-neighborhood of $A$. If additionally $A$ is invariant, then $A$ is an attractor.

Given an attracting set (resp. attractor) $A$, its basin of attraction is the set

$$B(A) = \{x \in \Sigma : \exists t \ge 0, \ \Phi_t(x) \subset U\}.$$

When $B(A) = \Sigma$, $A$ is a globally attracting set (resp. a global attractor).

Given a closed invariant set $S$, the induced dynamical system $\Phi^S$ on $S$ is defined by

$$\Phi_t^S(x) = \{\mathbf{x}(t) : \mathbf{x} \text{ is a solution to (5) with } \mathbf{x}(0) = x \text{ and } \mathbf{x}(\mathbb{R}) \subset S\}.$$

An invariant set $S$ is attractor free if there exists no proper subset $A$ of $S$ which is an attractor for $\Phi^S$.

Throughout the remainder of this section we let L denote an internally 
chain transitive set (for instance the limit set L = L(v n )). Properties of L 
will then be obtained through the next result (Benaim, Hofbauer and Sorin 
(2005), Lemma 3.5, Proposition 3.20 and Theorem 3.23): 

Proposition 2.7 (i) The set L is non-empty, compact, invariant and at- 
tractor free. 

(ii) If $A$ is an attracting set with $B(A) \cap L \ne \emptyset$, then $L \subset A$.

Some useful properties of attracting sets or attractors are the two fol- 
lowing (Benaim, Hofbauer and Sorin (2005), Propositions 3.25 and 3.27). 

Proposition 2.8 Let $A \subset \Sigma$ be compact with a bounded open neighborhood $U$ and $V : U \to [0, \infty[$. Assume the following conditions:

(i) $\Phi_t(U) \subset U$ for all $t \ge 0$,

(ii) $V^{-1}(0) = A$,

(iii) $V$ is continuous and for all $x \in U \setminus A$, $y \in \Phi_t(x)$ and $t > 0$, $V(y) < V(x)$.

Then $A$ contains an attractor whose basin contains $U$.

The map $V$ introduced in this proposition is called a strong Lyapunov function associated to $A$.

Let now $\Lambda$ be a subset of $\Sigma$ and $U \subset \Sigma$ an open neighborhood of $\Lambda$. A continuous function $V : U \to \mathbb{R}$ is called a Lyapunov function for $\Lambda \subset \Sigma$ if $V(y) < V(x)$ for all $x \in U \setminus \Lambda$, $y \in \Phi_t(x)$, $t > 0$; and $V(y) \le V(x)$ for all $x \in \Lambda$, $y \in \Phi_t(x)$ and $t \ge 0$.

Proposition 2.9 (Lyapunov) Suppose $V : U \to \mathbb{R}$ is a Lyapunov function for $\Lambda$ and $L \subset U$. Assume that $V(\Lambda)$ has an empty interior. Then $L \subset \Lambda$ and the restriction of $V$ to $L$ is constant.



3 Verification of hypothesis 2.1

This section is devoted to the verification of Hypothesis 2.1. The results given here will be used in section 4 to analyze specific situations.



3.1 Estimates based on compactness 

Let $\mathcal{M}_{ind}(E)$ denote the open set of indecomposable Markov matrices.

Proposition 3.1 Suppose that the sequence $(M_n)$ lies in a compact subset of $\mathcal{M}_{ind}(E)$ and verifies $\lim_{n \to \infty} (M_{n+1} - M_n) = 0$. Then hypothesis 2.1 holds.

This proposition is a direct consequence of the next lemma.

Lemma 3.2 Let $T\mathcal{M}(E)$ be the space of matrices $K = K(x, y)$ such that $\sum_y K(x, y) = 0$. The map $Q : \mathcal{M}_{ind}(E) \to T\mathcal{M}(E)$ which associates to $M$ its pseudo inverse and the map $\pi : \mathcal{M}_{ind}(E) \to \Delta$ which associates to $M$ its invariant measure are smooth maps.

Proof : Let $M \in \mathcal{M}_{ind}(E)$. The invariant probability of $M$, $\pi(M)$, is solution to $\phi(M, \pi) = 0$ where $\phi : \mathcal{M}_{ind}(E) \times \Delta \to T\Delta$ is the smooth map defined by

$$\phi(M, \mu) = \mu(I - M),$$

with $T\Delta = \{\mu : E \to \mathbb{R} : \sum_x \mu(x) = 0\}$. For all $\nu \in T\Delta$,

$$\frac{\partial \phi}{\partial \mu}(M, \mu).\nu = \nu(I - M).$$

Hence, by uniqueness of the invariant probability measure, $\frac{\partial \phi}{\partial \mu}(M, \mu)$ has kernel $\{0\}$ and the fact that $\pi$ is smooth follows from the implicit function theorem.

We denote by $\Pi(M) \in \mathcal{M}(E)$ the matrix defined by $\Pi(M)(x, y) = \pi(M)(y)$. The pseudo inverse of $M$ is solution to $\psi(M, Q) = 0$ where $\psi : \mathcal{M}_{ind}(E) \times T\mathcal{M}(E) \to T\mathcal{M}(E)$ is the smooth map defined by

$$\psi(M, Q) = Q(I - M) - (I - \Pi(M)).$$

For all $A \in T\mathcal{M}(E)$,

$$\frac{\partial \psi}{\partial Q}(M, Q).A = A(I - M).$$

Hence, by uniqueness of the invariant probability measure, $\frac{\partial \psi}{\partial Q}(M, Q)$ has kernel $\{0\}$ and the fact that $Q$ depends smoothly on $M$ follows from the implicit function theorem. QED

Let $K$ be a continuous mapping from a compact set $\Gamma$ into $\mathcal{M}(E)$ such that $K(w)$ is indecomposable for all $w \in \Gamma$. Assume $(w_n)$ is a sequence of $\Gamma$-valued random variables such that $M_n = K(w_n)$. If in addition $\lim_{n \to \infty} (M_{n+1} - M_n) = 0$, then proposition 3.1 applies.



3.2 Estimates based on log-Sobolev and spectral gap 
constants 

Propositions 3.3 and 3.4 below can be used to verify hypothesis 2.1 when the sequence $(M_n)$ is not bounded away from the complement of $\mathcal{M}_{ind}(E)$. The strategy is then to verify assertions (ii) and (iii) of proposition 3.3 and to use the estimates given by proposition 3.4 to verify assertion (i).

Proposition 3.3 Suppose that the matrices $(M_n)$ are indecomposable and that their pseudo inverses $(Q_n)$ and invariant probabilities $(\pi_n)$ satisfy almost surely

(i) $$\lim_{n \to \infty} |Q_n|^2 \frac{\log(n)}{n} = 0,$$

(ii) $$\limsup_{n \to \infty} |M_{n+1} - M_n| \frac{n}{\log(n)} < \infty,$$

(iii) $$\limsup_{n \to \infty} |\pi_{n+1} - \pi_n| \frac{n}{\log(n)} < \infty.$$

Then hypothesis 2.1 holds.

Proof : The proof amounts to showing that hypothesis 2.1 (ii) holds. Set $L_n = M_n - I$ and $\Pi_n = \Pi(M_n)$. Using the characterization of $Q_n$ one has

$$Q_{n+1}(L_{n+1} - L_n) + (Q_{n+1} - Q_n)L_n = \Pi_{n+1} - \Pi_n.$$

Hence,

$$Q_{n+1}(L_{n+1} - L_n)Q_n + (Q_{n+1} - Q_n)L_n Q_n = (\Pi_{n+1} - \Pi_n)Q_n.$$

That is (using $Q_m \Pi_n = 0$ for all $m$ and $n$, which follows from $Q_m \mathbf{1} = 0$, and $L_n Q_n = \Pi_n - I$)

$$Q_{n+1}(M_{n+1} - M_n)Q_n + (Q_n - Q_{n+1}) = (\Pi_{n+1} - \Pi_n)Q_n.$$

Therefore

$$|Q_n - Q_{n+1}| \le c\,(|Q_{n+1}||Q_n||M_{n+1} - M_n| + |\Pi_{n+1} - \Pi_n||Q_n|),$$

for some constant $c > 0$, and conditions (i), (ii), (iii) imply hypothesis 2.1 (ii). QED

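Let $\mathcal{M}_{irr}(E)$ denote the open set of irreducible Markov matrices. Let $M \in \mathcal{M}_{irr}(E)$ with invariant probability $\pi$ and let $f : E \to \mathbb{R}$. The variance, entropy and energy of $f$ are respectively defined as

$$\mathrm{var}(f) = \pi(f^2) - (\pi f)^2,$$

$$\mathcal{L}(f) = \sum_x f(x)^2 \log\left(\frac{f(x)^2}{\pi(f^2)}\right)\pi(x),$$

$$\mathcal{E}(f) = \frac{1}{2}\sum_{x,y} (f(y) - f(x))^2 M(x, y)\pi(x).$$

The spectral gap and log-Sobolev constants of $M$ are then defined to be

$$\lambda = \min\left\{\frac{\mathcal{E}(f)}{\mathrm{var}(f)} : \mathrm{var}(f) \ne 0\right\},$$

$$\alpha = \min\left\{\frac{\mathcal{E}(f)}{\mathcal{L}(f)} : \mathcal{L}(f) \ne 0\right\}.$$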
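As a hedged numerical illustration of the spectral gap, one can use the identity $\mathcal{E}(f) = \langle (I - M)f, f\rangle_\pi$ and conjugate by $\mathrm{diag}(\sqrt{\pi})$: the gap is then the second smallest eigenvalue of the symmetric part of the conjugated matrix. The function name `spectral_gap` is hypothetical, and the invariant probability `pi` is assumed to be supplied by the caller with positive entries.

```python
import numpy as np

def spectral_gap(M, pi):
    """Spectral gap min{E(f)/var(f)} of an irreducible Markov matrix M.

    pi must be the invariant probability of M (all entries positive).
    """
    M = np.asarray(M, dtype=float)
    pi = np.asarray(pi, dtype=float)
    d = len(pi)
    s = np.sqrt(pi)
    # A = D (I - M) D^{-1} with D = diag(sqrt(pi)); g^T A g equals E(f) for g = D f.
    A = (s[:, None] * (np.eye(d) - M)) / s[None, :]
    S = 0.5 * (A + A.T)  # only the symmetric part contributes to the quadratic form
    eig = np.sort(np.linalg.eigvalsh(S))
    # eig[0] is 0 (eigenvector sqrt(pi), i.e. constant f); the gap is the next one.
    return eig[1]
```

For a reversible chain this reduces to one minus the second largest eigenvalue of $M$. The log-Sobolev constant has no comparable closed form and is not computed here.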

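The following estimates follow from the quantitative results for finite Markov chains given in Saloff-Coste (1997).

Proposition 3.4 Let $M \in \mathcal{M}_{irr}(E)$ with invariant probability $\pi$, log-Sobolev constant $\alpha$ and spectral gap $\lambda$. For all $(x, y) \in E^2$ the following estimates hold:

(i) $$|Q(x, y)| \le \frac{1}{\lambda}\sqrt{\frac{\pi(y)}{\pi(x)}},$$

(ii) $$|Q(x, y)| \le \frac{1}{\alpha}\log_+\left(\log\frac{1}{\pi(x)}\right) + \frac{e}{\lambda},$$

where $\log_+(t) = \max(0, \log(t))$. In particular, with $\pi_* = \min_x \pi(x)$,

$$|Q| \le \frac{1}{\alpha}\log_+\left(\log\frac{1}{\pi_*}\right) + \frac{e}{\lambda}$$

and

$$|Q| \le \frac{1}{\lambda}\left(e + \frac{\log((1 - \pi_*)/\pi_*)}{1 - 2\pi_*}\log_+\left(\log\frac{1}{\pi_*}\right)\right).$$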

Proof : Let $L = -I + M$ and let $\{P_t\}$ be the continuous time semi-group $P_t = e^{tL}$. Then $Q$ can be written as

$$Q(x, y) = \int_0^\infty (P_t(x, y) - \pi(y))\,dt.$$

The first assertion then easily follows from the estimate

$$|P_t(x, y) - \pi(y)| \le \sqrt{\frac{\pi(y)}{\pi(x)}}\,e^{-\lambda t}$$

whose proof can be found in Saloff-Coste (1997, Corollary 2.1.5).

We now pass to the second assertion. If $\pi(x) \ge e^{-2}$ the inequality to be proved follows from inequality (i). Hence we assume that $\pi(x) < e^{-2}$, and we follow the line of the proof of Theorem 2.2.5 in Saloff-Coste (1997). For $q \ge 1$, we let $\|\cdot\|_q$ denote the norm in $\ell^q(\pi)$. We let $P_t^*$ denote the adjoint of $P_t$ in $\ell^2(\pi)$, and $p_t(x, y) = p_t^*(y, x) = P_t(x, y)/\pi(y)$. Let $g_x$ denote the function given by $g_x(y) = 0$ for $x \ne y$ and $g_x(x) = 1/\pi(x)$. Then

$$|P_t(x, y) - \pi(y)| \le \|p_t(x, \cdot) - 1\|_2 = \|(P_t^* - \pi)g_x\|_2.$$

Therefore

$$|P_{t+s}(x, y) - \pi(y)| \le \|p_{t+s}(x, \cdot) - 1\|_2 \le \|P_t^* - \pi\|_{2 \to 2}\,\|P_s^*\|_{k \to 2}\,\|g_x\|_k \le e^{-\lambda t}\,\|P_s^*\|_{k \to 2}\,\|g_x\|_k$$

for any $k \ge 1$, where we have used the fact that $\|P_t^* - \pi\|_{2 \to 2} \le e^{-\lambda t}$. Let $q$ be the Hölder conjugate of $k$. Then $\|P_s^*\|_{k \to 2} = \|P_s\|_{2 \to q}$ and $\|g_x\|_k = \pi(x)^{-1/q}$. Now choose $q(s) = 1 + e^{2\alpha s}$. By hypercontractivity (see Theorem 2.2.4 in Saloff-Coste (1997)), $\|P_s\|_{2 \to q(s)} \le 1$ so that

$$|P_{t+s}(x, y) - \pi(y)| \le e^{-\lambda t}\,\pi(x)^{-1/q(s)}.$$

Hence

$$|Q(x, y)| \le 2s + \frac{1}{\lambda}\,\pi(x)^{-1/q(s)}.$$

For $s = \frac{1}{2\alpha}\log_+\left(\log\frac{1}{\pi(x)}\right)$ this gives the desired result.

The uniform bounds on $|Q|$ follow from the rough estimates

$$\frac{1 - 2\pi_*}{\log((1 - \pi_*)/\pi_*)}\,\lambda \le \alpha \le \lambda/2$$

given in Saloff-Coste (1997, Lemma 2.2.2 and Corollary 2.2.10). QED



4 Some applications 

In sections 4.1 and 4.2 we are interested in the long term behavior of the empirical occupation measure of the process. We then let $\Sigma = \Delta$, $V_n = \delta_{X_n}$ and

$$v_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}.$$

Hence, $\overline{V}_n(x) = \delta_x$ and $\theta_n = \pi_n$.

4.1 Markov chains

Let $(M_n)$ be a deterministic (or $\mathcal{F}_0$-measurable) sequence of Markov matrices over $E$. A non homogeneous Markov chain with transition matrices $(M_n)$ is an adapted process $(X_n)$ on $E$ verifying (1).

Proposition 4.1 Let $L((\pi_n)) \subset \Delta$ denote the limit set of $(\pi_n)$ and let $\mathrm{conv}[L((\pi_n))]$ denote its convex hull. Suppose that hypothesis 2.1 holds. Then $L((v_n)) \subset \mathrm{conv}[L((\pi_n))]$ with probability one.

Proof : The set $C = \Delta \times \mathrm{conv}[L((\pi_n))]$ is adapted to $(v_n, \pi_n)$. The induced differential inclusion $\dot{v} \in -v + \mathrm{conv}[L((\pi_n))]$ has a unique global attractor $\mathrm{conv}[L((\pi_n))]$. Hence, by Theorem 2.6 and Proposition 2.7 (ii), $L((v_n)) \subset \mathrm{conv}[L((\pi_n))]$. QED

Corollary 4.2 Suppose that the sequence $(M_n)$ lies in a compact subset of $\mathcal{M}_{ind}(E)$ and verifies $M_{n+1} - M_n \to 0$. Then the conclusion of proposition 4.1 holds.

Proof : Follows from proposition 4.1 and proposition 3.1. QED

Corollary 4.3 Assume that $M_n \to M \in \mathcal{M}_{ind}(E)$. Then $v_n \to \pi$, the invariant probability of $M$.

Markov chains with rare transitions

Among the well studied chains that motivate our analysis are the chains with rare transitions.

Let $M_0$ be an irreducible Markov matrix over $E$, reversible with respect to a reference probability $\pi_0$. That is

$$\pi_0(x)M_0(x, y) = \pi_0(y)M_0(y, x).$$

We sometimes call such an $M_0$ an exploration matrix since it provides a way to explore the state space.

Let $W : E \times E \to \mathbb{R}$ be a map and $(\beta_n)$ a sequence of positive numbers. Set

$$M_n(x, y) = M(\beta_n, x, y) \quad (6)$$

where

$$M(\beta, x, y) = \begin{cases} M_0(x, y)\,\psi[\exp(-\beta W(x, y))] & \text{if } x \ne y, \\ 1 - \sum_{z \ne x} M(\beta, x, z) & \text{if } x = y, \end{cases}$$

and

$$\psi(u) = \min(1, u) \quad (7)$$

or

$$\psi(u) = \frac{u}{1 + u}.$$
In particular, let $U : E \to \mathbb{R}$ be a map, and let

$$W(x, y) = U(y) - U(x), \quad (8)$$

then $(M_n)$ are the transition matrices of the so-called Metropolis-Hastings ($\beta_n = \beta$) or simulated annealing ($\beta_n \to \infty$) algorithm (Hajek (1982), Holley and Stroock (1988), Miclo (1992)).
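The construction (6), (7), (8) can be sketched numerically as follows. This is only an illustration under stated assumptions: the names `rare_transition_matrix` and `gibbs_measure` are hypothetical, and the default acceptance function is the choice $\psi(u) = \min(1, u)$ of (7).

```python
import numpy as np

def rare_transition_matrix(M0, W, beta, psi=lambda u: np.minimum(1.0, u)):
    """Build M(beta, ., .) from an exploration matrix M0 and a cost map W."""
    M0 = np.asarray(M0, dtype=float)
    W = np.asarray(W, dtype=float)
    M = M0 * psi(np.exp(-beta * W))           # off-diagonal entries
    np.fill_diagonal(M, 0.0)
    np.fill_diagonal(M, 1.0 - M.sum(axis=1))  # diagonal makes each row sum to 1
    return M

def gibbs_measure(pi0, U, beta):
    """Probability proportional to exp(-beta U(x)) pi0(x)."""
    U = np.asarray(U, dtype=float)
    w = np.asarray(pi0, dtype=float) * np.exp(-beta * (U - U.min()))
    return w / w.sum()
```

With `W[x, y] = U[y] - U[x]` and `M0` reversible with respect to `pi0`, the vector returned by `gibbs_measure` is invariant for the matrix returned by `rare_transition_matrix`, since detailed balance holds for both choices of $\psi$ given in the text.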

Consider the Markov chain with rare transitions (6) where $W$ is given by (8). For $x, y \in E$ a path $\gamma$ from $x$ to $y$ is a sequence of points $x = x_0, x_1, \ldots, x_n = y$ such that $M_0(x_i, x_{i+1}) > 0$. We let $\Gamma_{x,y}$ denote the set of all paths from $x$ to $y$. The elevation from $x$ to $y$ is defined as

$$\mathrm{Elev}(x, y) = \min\{\max\{U(z) : z \in \gamma\} : \gamma \in \Gamma_{x,y}\}$$

and the energy barrier as

$$U^\# = \max\{\mathrm{Elev}(x, y) - U(x) - U(y) + \min U : x \in E, y \in E\}. \quad (9)$$
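The elevation is a minimax path problem, so it can be computed with a Floyd-Warshall recursion in which (min, +) is replaced by (min, max). The sketch below is one possible implementation, not a method taken from the text; the names `elevation` and `energy_barrier` are hypothetical, and the trivial path is used for $x = y$.

```python
import numpy as np

def elevation(M0, U):
    """Matrix of Elev(x, y) for the graph {(x, y) : M0(x, y) > 0}."""
    M0 = np.asarray(M0, dtype=float)
    U = np.asarray(U, dtype=float)
    d = len(U)
    elev = np.full((d, d), np.inf)
    for x in range(d):
        elev[x, x] = U[x]  # trivial path from x to itself
        for y in range(d):
            if x != y and M0[x, y] > 0:
                elev[x, y] = max(U[x], U[y])
    for z in range(d):
        # Best path through z: the higher of the two legs x -> z and z -> y.
        via_z = np.maximum(elev[:, [z]], elev[[z], :])
        elev = np.minimum(elev, via_z)
    return elev

def energy_barrier(M0, U):
    """Energy barrier of equation (9)."""
    U = np.asarray(U, dtype=float)
    elev = elevation(M0, U)
    return np.max(elev - U[:, None] - U[None, :] + U.min())
```

On a three point line with $U = (0, 2, 1)$ the barrier equals 1, the height that must be climbed to leave the local minimum at the third point.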

Proposition 4.4 Consider the Markov chain with rare transitions (6) where $W$ is given by (8). Assume that $\beta_n = \beta(n)$ where $\beta : \mathbb{R}_+ \to \mathbb{R}_+$ is differentiable and verifies

$$0 \le \dot{\beta}(t) \le \frac{A}{t}$$

for some $A < 1/(2U^\#)$. Then $v_n \to \pi$ where

$$\pi(x) \propto \pi_0(x)\,1_{\mathrm{Argmin}\,U}(x).$$

Proof : Our first goal is to verify hypothesis 2.1. Let $\lambda(\beta)$ denote the spectral gap of $M(\beta, \cdot, \cdot)$. It follows from Theorem 2.1 in Holley and Stroock (1988) that

$$\lim_{\beta \to \infty} \frac{1}{\beta}\log\lambda(\beta) = -U^\#. \quad (10)$$

The invariant probability measure of $M(\beta, \cdot, \cdot)$ is the Gibbs measure

$$\pi_\beta(x) \propto \exp(-\beta U(x))\,\pi_0(x). \quad (11)$$

Since $\beta_n \le \beta_1 + A\log(n)$, by application of the last inequality of Proposition 3.4 one gets that hypothesis 2.1 (i) holds.

For $x \ne y$

$$\frac{\partial M(\beta, x, y)}{\partial \beta} = -M_0(x, y)\,W(x, y)\,\psi'(\exp(-\beta W(x, y)))\exp(-\beta W(x, y)).$$

Using the fact that $|\psi'(t)t| \le 1$, one gets that

$$\left|\frac{\partial M(\beta, x, y)}{\partial \beta}\right| \le c$$

for some $c > 0$. Hence by the mean value theorem

$$|M_{n+1} - M_n| \le c\,|\beta_{n+1} - \beta_n| \le (Ac)/n.$$

This proves assertion (ii) of proposition 3.3. The proof of assertion (iii) is similar since

$$\left|\frac{\partial \pi_\beta(x)}{\partial \beta}\right| = \pi_\beta(x)\left|U(x) - \sum_y \pi_\beta(y)U(y)\right| \le 2|U|.$$

This concludes the verification of hypothesis 2.1.

Here $\pi_n(x) \propto \exp(-\beta_n U(x))\,\pi_0(x)$ so that $\pi_n \to \pi$. The result follows from Proposition 4.1. QED



Remark 4.5 For general $W$, it is always possible to define a quasi-potential $U$ (defined in terms of $W$ and $M_0$) and an energy barrier (in general not given by (9)) such that both equations (10) and (11) hold. We refer the reader to Miclo (1992) for more details and proofs. With this quasi-potential and barrier, Proposition 4.4 holds.



4.2 Vertex reinforced random walks 

Vertex-reinforced random walks (VRRW) were first introduced by Pemantle 
(1988, 1992). 

Suppose $\mathcal{F}_n = \sigma(X_1, \ldots, X_n)$. A general VRRW on $E$ is defined by

$$M_n(x, y) = K_n(x, y, v_n)$$

where for each integer $n$ and $v \in \Delta$, $K_n(\cdot, \cdot, v)$ is a deterministic Markov matrix over $E$, which specifies the rule of the reinforcement. The following result was proved in Benaïm (1997).

Proposition 4.6 Assume that there exists a $[0, 1]$-valued sequence $\epsilon_n$ converging to 0 at infinity such that $K_n(x, y, v) = K(x, y, \epsilon_n, v)$, that the map $(\epsilon, v) \mapsto K(\cdot, \cdot, \epsilon, v)$ is continuous on $[0, 1] \times \Delta$ and that $K(\cdot, \cdot, \epsilon, v)$ is indecomposable for each $(\epsilon, v) \in [0, 1] \times \Delta$. Let $\pi(v)$ denote the invariant measure of $K(\cdot, \cdot, 0, v)$. Then the limit set of $(v_n)$ is almost surely an internally chain transitive set of the differential equation

$$\dot{v} = -v + \pi(v). \quad (12)$$

Proof : This follows from Proposition 3.1 and Theorem 2.6. QED



Linear reinforcement 



The original VRRW as defined by Pemantle (1988, 1992) corresponds to a linear reinforcement:

$$M_n(x, y) \propto U(x, y)\left(1 + \sum_{i=1}^n 1_{\{X_i = y\}}\right)$$

where $U$ is a matrix with nonnegative entries.

We will here assume that $U$ has positive entries. Then, for each $n$, $M_n$ is irreducible. With the notation of the previous paragraph,

$$M_n(x, y) = K(x, y, 1/n, v_n) \quad (13)$$

where for $(\epsilon, v) \in [0, 1] \times \Delta$,

$$K(x, y, \epsilon, v) \propto U(x, y)\,[\epsilon + v(y)]. \quad (14)$$

The mapping $(\epsilon, v) \mapsto K(\cdot, \cdot, \epsilon, v)$ is continuous on $[0, 1] \times \Delta$.
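A linearly reinforced walk is straightforward to simulate, which can help build intuition for the behavior of $(v_n)$. The sketch below is a minimal simulation under stated assumptions: the function name `simulate_vrrw`, the starting point `x0` and the seeding convention are hypothetical choices, and $U$ is assumed to have positive entries.

```python
import numpy as np

def simulate_vrrw(U, n_steps, x0=0, seed=0):
    """Simulate n_steps of a linearly reinforced VRRW and return v_n.

    The next state y is drawn with probability proportional to
    U(x, y) * (1 + number of past visits to y).
    """
    rng = np.random.default_rng(seed)
    U = np.asarray(U, dtype=float)
    d = U.shape[0]
    counts = np.zeros(d)
    x = x0
    counts[x] += 1  # X_1 = x0
    for _ in range(n_steps - 1):
        weights = U[x] * (1.0 + counts)
        x = rng.choice(d, p=weights / weights.sum())
        counts[x] += 1
    return counts / n_steps  # empirical occupation measure
```

Running the simulation for several seeds with a symmetric `U` gives a rough picture of the limit points described in the results that follow.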

On a finite graph, this process was first analyzed by Pemantle (1992) for symmetric positive matrices ($U(x, y) = U(y, x) > 0$) and later by Benaïm (1997) for general positive matrices using proposition 4.6. An example of what can be proved is the following result, first due to Pemantle (1992).

Proposition 4.7 Suppose $U(x, y) = U(y, x) > 0$. Then the limit set of $(v_n)$ is a compact connected subset of the critical set of the map

$$U(v, v) = \sum_{x,y} U(x, y)v(x)v(y).$$

Proof : This follows from the fact that $v \mapsto U(v, v)$ is a strict Lyapunov function of (12) whose critical points are the zeroes of (12). QED

When the matrix $U$ has zero entries, $K(x, y, 0, v)$ may no longer be indecomposable for some $v \in \partial\Delta$ and proposition 4.6 cannot be applied. This makes the analysis of VRRW with linear reinforcement much more difficult. Beautiful results on $\mathbb{Z}$ and $\mathbb{Z}^d$ have been obtained by Pemantle and Volkov (1999), Volkov (2001) and Tarrès (2004). We refer the reader to Pemantle (2007) for a survey and further references.

Non homogeneous linear reinforcement 

Let $(a_n)$ be a positive sequence and denote $r_n = \sum_{i=1}^n a_i$. We will assume
that lim^oo = 1. Consider the VRRW corresponding to: 



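$$M_n(x, y) \propto U(x, y)\left(1 + \sum_{i=1}^n a_i 1_{\{X_i = y\}}\right) \quad (15)$$

where $U$ is a matrix with positive entries. Equivalently, $M_n(x, y) = K(x, y, \epsilon_n, w_n)$ with $\epsilon_n = 1/r_n$ and $w_n = \frac{1}{r_n}\sum_{i=1}^n a_i\delta_{X_i}$. Using proposition 3.1 it is not hard to check that hypothesis 2.1 and hypothesis 2.2 (with $V_i = \delta_{X_i}$) are satisfied, so that theorem 2.6 applies.

Since $\delta_{X_i} = v_i + (i - 1)(v_i - v_{i-1})$, using the convention $r_0 = v_0 = 0$ and summing by parts,

$$w_n = v_n + \frac{1}{r_n}\sum_{i=1}^{n-1} (i\,a_{i+1} - r_i)(v_{i+1} - v_i).$$

Since $|v_{i+1} - v_i| \le 2/i$,

$$|w_n - v_n| \le \frac{2}{r_n}\sum_{i=1}^{n-1} \left|a_{i+1} - \frac{r_i}{i}\right|.$$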

Consider now the two following classes of sequences $(a_i)$:

(i) $a_i = a(i)$ where $a$ is a nondecreasing continuous function such that for all $s \in ]0, 1]$, $\lim_{t \to \infty} \frac{a(ts)}{a(t)} = 1$.

(ii) $a_i = a(i)$ where $a$ is a decreasing continuous function such that for all $s \in ]0, 1]$, $\lim_{t \to \infty} \frac{a(ts)}{a(t)} = 1$, and there exists $b : [0, 1] \to \mathbb{R}_+$ measurable such that $\int_0^1 b(s)\,ds < \infty$ and for all $(s, t) \in ]0, 1] \times \mathbb{R}_+$,

$$0 \le \frac{a(ts)}{a(t)} - 1 \le b(s).$$

For example $a_i = (\log(i + 1))^\alpha$ satisfies (i) for $\alpha \ge 0$ and (ii) for $\alpha < 0$.

Lemma 4.8 Assume (i) or (ii) holds. Then $\lim_{n \to \infty} |w_n - v_n| = 0$.

Proof : Note that it suffices to prove that $\frac{r_i}{i} - a_{i+1} = o(a_i)$. Assume first (i) holds. Then, comparing the sum $r_i$ with the integral of $a$ over $[0, i]$,

$$0 \le a_{i+1} - \frac{r_i}{i} \le (a_{i+1} - a_i) + a_i\int_0^1 \left(1 - \frac{a(is)}{a(i)}\right)ds = o(a_i).$$

Assume now (ii) holds. Then

$$0 \le \frac{r_i}{i} - a_{i+1} \le (a_i - a_{i+1}) + a_i\int_0^1 \left(\frac{a(is)}{a(i)} - 1\right)ds = o(a_i)$$

by the dominated convergence theorem. QED

Let $\pi(\epsilon, v)$ denote the invariant probability of $K(\cdot, \cdot, \epsilon, v)$ and $\pi(v) = \pi(0, v)$. The map $(\epsilon, v) \mapsto \pi(\epsilon, v)$ is uniformly continuous. Since $\pi_n = \pi(\epsilon_n, w_n)$, the previous lemma implies that when (i) or (ii) holds, $\lim_{n \to \infty} |\pi_n - \pi(v_n)| = 0$. This last property, together with theorem 2.6, implies the following.

Theorem 4.9 Assume that (i) or (ii) holds. Then the limit set of $(v_n)$ is almost surely an internally chain transitive set of the differential equation

$$\dot{v} = -v + \pi(v). \quad (16)$$

Note that proposition 4.7 also holds for sequences $(a_i)$ satisfying (i) or (ii).

Exponential reinforcement

Let $U : E \times E \to \mathbb{R}$ be a map. For $x \in E$ and $v \in \Delta$, set

$$U(x, v) = \sum_y U(x, y)v(y),$$

$$W(x, y, v) = U(y, v) - U(x, v),$$

$$K(\beta, x, y, v) = \begin{cases} M_0(x, y)\,\psi[\exp(-\beta W(x, y, v))] & \text{if } x \ne y, \\ 1 - \sum_{z \ne x} K(\beta, x, z, v) & \text{if } x = y, \end{cases}$$

and

$$K_n(x, y, v) = K(\beta_n, x, y, v). \quad (17)$$

Here $M_0$ is an exploration matrix, $(\beta_n)_n$ is a positive sequence and $\psi$ is given by (7). When $\beta_n = \beta$, such a VRRW can be seen as a discrete time version of the self-interacting diffusions on compact manifolds that have been thoroughly analyzed by Benaïm, Ledoux and Raimond (2002) and Benaïm and Raimond (2003, 2005, 2006). When $\beta_n = A\log(n)$, the VRRW can be seen as a discrete time version of the self-interacting diffusions on compact manifolds studied by Raimond (2006).

Let $U^\#(\cdot, y)$ be the energy barrier, as defined by equation (9), of the map $x \mapsto U(x, y)$.

Theorem 4.10 Consider the VRRW with exponential reinforcement defined by (17). Assume that $\beta_n = \beta(n)$ where $\beta : \mathbb{R}_+ \to \mathbb{R}_+$ is differentiable and verifies

$$0 \le \dot{\beta}(t) \le \frac{A}{t}$$

for some $A < 1/(2\max\{U^\#(\cdot, y) : y \in E\})$. Let

$$C(v) = \Delta(\mathrm{Argmin}\,U(\cdot, v))$$

denote the set of probabilities supported by $\mathrm{Argmin}\,U(\cdot, v)$. Then the limit set of $(v_n)$ is an internally chain transitive set of

$$\dot{v} \in -v + C(v).$$

Proof : This is an application of Theorem 2.6. The verification of hypothesis 2.1 is similar to the one given in proposition 4.4. Details are left to the reader.

It is easily seen that $C$ is a closed set with convex values. For $v \in \Delta$, let

$$\pi_n[v](x) \propto \pi_0(x)\exp(-\beta_n U(x, v))$$

and

$$\pi[v](x) \propto \pi_0(x)\,1_{\mathrm{Argmin}\,U(\cdot, v)}(x).$$

The invariant probability of $K_n$ is $\pi_n[v_n]$ and

$$\lim_{n \to \infty} \pi_n[v](x) = \pi[v](x).$$

This proves that $C$ is adapted to $(v_n, \pi_n[v_n])$ and the result follows from Theorem 2.6. QED

Corollary 4.11 (symmetric interaction) Assume that the hypotheses of Theorem 4.10 hold and assume furthermore that $U$ is symmetric (i.e. $U(x, y) = U(y, x)$). Then $(v_n)$ converges almost surely to a connected component of the set

$$\{v \in \Delta : v \in C(v)\}.$$

Proof : For $u, v \in \Delta$ set

$$U(u, v) = \sum_{x,y} U(x, y)u(x)v(y)$$

and let

$$H(v) = \frac{1}{2}U(v, v).$$

We claim that $H$ is a Lyapunov function of the differential inclusion (5). Let $t \mapsto v(t)$ be a solution to (5). Then, for almost all $t \ge 0$,

$$\frac{d}{dt}H(v(t)) = \frac{1}{2}\left[U(\dot{v}(t), v(t)) + U(v(t), \dot{v}(t))\right] = U(\dot{v}(t), v(t))$$

$$= U(\dot{v}(t) + v(t), v(t)) - U(v(t), v(t)) = \min_x U(x, v(t)) - U(v(t), v(t)),$$

where we have used the symmetry of $U$, the fact that $\dot{v} + v \in C(v)$ and the definition of $C(v)$. Since $t \mapsto H(v(t))$ is locally Lipschitz and the last expression is nonpositive, it is nonincreasing.

If now $t \mapsto H(v(t))$ is constant over a time interval, then $v(t) \in C(v(t))$ over this time interval. This proves that $H$ is a Lyapunov function for $\Lambda = \{v \in \Delta : v \in C(v)\}$. The result now follows from Proposition 2.9 (compare to Benaïm, Hofbauer and Sorin (2005), Theorem 5.5) provided we show that $H(\Lambda)$ has empty interior.

Let $v \in \Lambda \cap \mathrm{int}(\Delta)$. Since the mapping $x \mapsto U(x, v)$ is constant, for all $w \in \Delta$, $U(w, v) = U(v, v)$. Therefore $H(v) = \frac{1}{2}U(w, v)$ for all $w \in \Delta$. It follows, by symmetry of $U$, that $H$ restricted to $\Lambda \cap \mathrm{int}(\Delta)$ is a constant map. The same reasoning applies to prove that $H$ restricted to each face of $\Delta$ is a constant map. We thus have proved that $H(\Lambda)$ takes finitely many values. QED

Remark 4.12 Corollary 4.11 still holds true under the weaker assumption that the map $v \mapsto U(x, v)$ is smooth and convex in $v$.

Corollary 4.13 Assume that $U$ is symmetric and nonnegative and that

$$\mathrm{Ker}(U) \cap T\Delta = \{0\}.$$

Then $\{v \in \Delta : v \in C(v)\}$ reduces to a singleton $v^*$ and $(v_n)$ converges almost surely to $v^*$.

Proof : Let $v \in C(v)$, $w \in \Delta$ and $h = w - v$. Since $v \in C(v)$, $U(v, h) \ge 0$. Thus $U(w, w) - U(v, v) = 2U(v, h) + U(h, h) \ge 0$, proving that $v$ is a global minimum of $v \mapsto U(v, v)$. Since $U(h, h) > 0$ for $h = w - v \ne 0$, such a global minimum is unique. QED

4.3 Games 

Consider a two-player game. We let $E_1$ (respectively $E_2$) denote the finite set of actions available to player 1 (respectively player 2) and

$$U = (U^1, U^2) : E_1 \times E_2 \to \mathbb{R} \times \mathbb{R}$$

denote the payoff function of the game. If player 1 and player 2 choose respectively the actions $x \in E_1$ and $y \in E_2$, then player 1 gets $U^1(x,y)$ and player 2 gets $U^2(x,y)$.

Let $((X_n, Y_n))$ denote the sequence of plays. In noncooperative game theory we assume that $((X_n, Y_n))$ is adapted to some filtration $(\mathcal{F}_n)$ and that at the beginning of round $n+1$, players have no information on the action to be played by their opponents: for all $(x,y) \in E_1 \times E_2$ and $n \in \mathbb{N}$

$$P(X_{n+1} = x, Y_{n+1} = y \mid \mathcal{F}_n) = P(X_{n+1} = x \mid \mathcal{F}_n)\, P(Y_{n+1} = y \mid \mathcal{F}_n).$$
4.3.1 Markovian fictitious play

For $x \in E_1$ and $v^2 \in \Delta(E_2)$ set

$$U^1(x, v^2) = \sum_{y \in E_2} U^1(x,y)\, v^2(y).$$

Let

$$v_n^2 = \frac{1}{n} \sum_{i=1}^{n} \delta_{Y_i}.$$

A well studied strategy known as "fictitious play" consists for player 1 in playing at time $n+1$ an action maximizing $U^1(\cdot, v_n^2)$, that is

$$X_{n+1} \in \mathrm{Argmax}\, U^1(\cdot, v_n^2). \tag{18}$$

This strategy relies on the idea that, in the absence of information on the next move of his opponent, player 1 assumes that the opponent will play according to the past empirical distribution of his moves. While fictitious play was originally proposed in 1951 by Brown as an algorithm to compute Nash equilibria, it has recently been rediscovered as a "learning model" (Fudenberg and Kreps (1993); Fudenberg and Levine (1998)) and has been extensively studied (Monderer and Shapley (1996); Benaïm and Hirsch (1999); Hofbauer and Sandholm (2002); Benaïm, Hofbauer and Sorin (2005, 2006); see also Pemantle (2007) for an overview and further references).
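As an illustration of the fictitious play rule, the following minimal sketch computes the empirical distribution of the opponent's past moves and the set of actions maximizing player 1's expected payoff against it. The function names, the list-of-lists encoding of $U^1$ and the matching-pennies payoffs are assumptions made for the example; they do not come from the paper.

```python
from typing import List, Sequence

def empirical_distribution(plays: Sequence[int], n_actions: int) -> List[float]:
    """Empirical frequency of the opponent's past actions (plays must be nonempty)."""
    counts = [0] * n_actions
    for y in plays:
        counts[y] += 1
    return [c / len(plays) for c in counts]

def expected_payoff(U1: Sequence[Sequence[float]], x: int, v2: Sequence[float]) -> float:
    """Linear extension U^1(x, v^2) = sum_y U^1(x, y) v^2(y)."""
    return sum(U1[x][y] * v2[y] for y in range(len(v2)))

def best_responses(U1: Sequence[Sequence[float]], v2: Sequence[float],
                   tol: float = 1e-12) -> List[int]:
    """Actions of player 1 maximizing U^1(., v^2); ties are all returned."""
    payoffs = [expected_payoff(U1, x, v2) for x in range(len(U1))]
    best = max(payoffs)
    return [x for x, p in enumerate(payoffs) if p >= best - tol]

# Hypothetical example: matching pennies, opponent played 0, 0, 1, 0 so far.
U1 = [[1.0, -1.0], [-1.0, 1.0]]
v2 = empirical_distribution([0, 0, 1, 0], n_actions=2)
next_actions = best_responses(U1, v2)
```

Under these assumptions the opponent's empirical distribution puts weight 3/4 on action 0, so action 0 is the unique maximizer; against the uniform distribution both actions are maximizers.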

Fictitious play requires solving the maximization problem (18) at each stage of the game. If the cardinality of $E_1$ is too large (or if players have computational limits) such a computation may be problematic. An alternative strategy proposed first in Benaïm, Hofbauer and Sorin (2006), based on pairwise comparison of payoffs, is as follows: the strategy of player 1 is such that $P(X_{n+1} = y \mid \mathcal{F}_n) = M_n(X_n, y)$ with $M_n$ the Markov matrix defined by

$$M_n(x,y) = \begin{cases} M_0(x,y)\,\psi[\exp(-\beta_n W_n(x,y))] & \text{if } x \neq y, \\ 1 - \sum_{y' \neq x} M_n(x,y') & \text{if } x = y, \end{cases} \tag{19}$$

where

$$W_n(x,y) = U^1(x, v_n^2) - U^1(y, v_n^2),$$

$M_0$ is an exploration matrix, $\psi$ is given by (7) and $\beta_n$ is an increasing positive sequence. Such a strategy will be called a Markovian fictitious play strategy.
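The construction of the matrix in (19) can be sketched as follows. The function $\psi$ is defined earlier in the paper and is not reproduced in this section, so the sketch assumes the Metropolis choice $\psi(u) = \min(1,u)$; the names `mfp_matrix`, `M0`, `U1` and the numerical values are hypothetical. With this $\psi$, $\psi[\exp(-\beta W)] = \exp(\min(0, -\beta W))$, which avoids overflow for large $\beta$.

```python
import math
from typing import List, Sequence

def mfp_matrix(M0: Sequence[Sequence[float]], U1: Sequence[Sequence[float]],
               v2: Sequence[float], beta: float) -> List[List[float]]:
    """Markov matrix of a Markovian fictitious play step for player 1.

    ASSUMPTION: psi(u) = min(1, u) (Metropolis rule); the paper's psi may differ.
    """
    n = len(M0)
    # Expected payoff U^1(x, v^2) of each action against the empirical distribution.
    payoff = [sum(U1[x][y] * v2[y] for y in range(len(v2))) for x in range(n)]
    M = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y:
                W = payoff[x] - payoff[y]
                # psi(exp(-beta * W)) for psi(u) = min(1, u), computed stably.
                M[x][y] = M0[x][y] * math.exp(min(0.0, -beta * W))
        # The diagonal entry makes each row sum to one.
        M[x][x] = 1.0 - sum(M[x][y] for y in range(n) if y != x)
    return M

# Hypothetical example: uniform exploration, matching pennies, beta = 2.
M0 = [[0.5, 0.5], [0.5, 0.5]]
U1 = [[1.0, -1.0], [-1.0, 1.0]]
M = mfp_matrix(M0, U1, v2=[0.75, 0.25], beta=2.0)
```

In this example action 0 has the higher expected payoff, so the move from 1 to 0 is accepted with the full exploration probability while the move from 0 to 1 is damped by the factor $e^{-2}$.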
Adopting the viewpoint of player 1, we choose, as an observation space,

$$\Sigma = \Delta(E_1) \times \Delta(E_2)$$

and as an observation variable

$$V_n = (\delta_{X_n}, \delta_{Y_n}).$$

Hence $(v_n)$ is the empirical frequency of the actions played up to time $n$, and

$$V_n(x) = (\delta_x, \nu_n),$$

where $\nu_n = E(\delta_{Y_{n+1}} \mid \mathcal{F}_n)$.

We let $U^{1,\#}(y)$ denote the energy barrier, as defined earlier, of the map $x \mapsto U^1(x,y)$.

Theorem 4.14 Assume that player 1 plays a Markovian fictitious play strategy as given by (19). Assume that $\beta_n = \beta(n)$ where $\beta$ is differentiable, $\lim_{t \to \infty} \beta(t) = \infty$ and verifies

$$0 \leq \dot\beta(t) \leq \frac{A}{t}$$

for some $A < 1/(2\max\{U^{1,\#}(y) : y \in E_2\})$.
For $v = (v^1, v^2) \in \Delta(E_1) \times \Delta(E_2)$ let

$$C_1(v^2) = \Delta(\mathrm{Argmax}\, U^1(\cdot, v^2))$$

and

$$C(v) = C_1(v^2) \times \Delta(E_2).$$

Then the limit set of $(v_n)$ is an internally chain transitive set of

$$\dot v \in -v + C(v).$$

Proof : This is still an application of Theorem 2.6. The verification of Hypothesis 2.1 is similar to the one given in Proposition 4.4. Let

$$\pi_n[v^2](x) \propto \pi_0(x) \exp(\beta_n U^1(x, v^2))$$

and

$$\pi[v^2](x) \propto \pi_0(x)\, \mathbf{1}_{\mathrm{Argmax}(U^1(\cdot, v^2))}(x).$$

Then the invariant probability of $M_n$ is $\pi_n = \pi_n[v_n^2]$ and $\theta_n = \pi_n V_n = (\pi_n, \nu_n)$ with $\nu_n = E(\delta_{Y_{n+1}} \mid \mathcal{F}_n)$. Since $\pi_n[v^2] \to \pi[v^2] \in C_1(v^2)$ it follows that $C$ is an adapted graph. QED
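The convergence of the Gibbs-type measure $\pi_n[v^2]$ towards a measure supported by the maximizers of $U^1(\cdot, v^2)$ can be checked numerically. The sketch below is only an illustration: the names `gibbs` and `limit_measure`, the reference measure `pi0` and the payoff vector are hypothetical values chosen for the example.

```python
import math
from typing import List, Sequence

def gibbs(pi0: Sequence[float], payoff: Sequence[float], beta: float) -> List[float]:
    """Measure proportional to pi0(x) * exp(beta * payoff(x))."""
    top = max(payoff)  # subtracting the maximum keeps the exponentials bounded
    weights = [p * math.exp(beta * (u - top)) for p, u in zip(pi0, payoff)]
    z = sum(weights)
    return [w / z for w in weights]

def limit_measure(pi0: Sequence[float], payoff: Sequence[float],
                  tol: float = 1e-12) -> List[float]:
    """Measure proportional to pi0 restricted to the maximizers of payoff."""
    top = max(payoff)
    weights = [p if u >= top - tol else 0.0 for p, u in zip(pi0, payoff)]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical example: three actions, the first two are both maximizers.
pi0 = [0.2, 0.3, 0.5]
payoff = [1.0, 1.0, 0.0]
target = limit_measure(pi0, payoff)
gap = [max(abs(a - b) for a, b in zip(gibbs(pi0, payoff, beta), target))
       for beta in (1.0, 5.0, 50.0)]
```

The distance to the limit measure decreases as $\beta$ grows, and the limit keeps the relative weights that $\pi_0$ gives to the maximizers.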

Much more can be said under the assumption that both players adopt a Markovian fictitious play strategy: $P(X_{n+1} = y \mid \mathcal{F}_n) = M_n^1(X_n, y)$ and $P(Y_{n+1} = y \mid \mathcal{F}_n) = M_n^2(Y_n, y)$, with $M_n^1$ and $M_n^2$ the Markov matrices defined by (with $i \in \{1,2\}$)

$$M_n^i(x,y) = \begin{cases} M_0^i(x,y)\,\psi[\exp(-\beta_n^i W_n^i(x,y))] & \text{if } x \neq y, \\ 1 - \sum_{y' \neq x} M_n^i(x,y') & \text{if } x = y, \end{cases} \tag{20}$$

where

$$W_n^1(x,y) = U^1(x, v_n^2) - U^1(y, v_n^2),$$
$$W_n^2(x,y) = U^2(v_n^1, x) - U^2(v_n^1, y),$$

$M_0^i$ is an exploration matrix, $\psi$ is as in (19) and $\beta_n^i$ is an increasing positive sequence.

Let $\mathrm{Conv}(U)$ denote the convex hull in $\mathbb{R}^2$ of the set $\{U(x,y) : x \in E_1, y \in E_2\}$ of all possible payoffs. We now choose

$$\Sigma = \Delta(E_1) \times \Delta(E_2) \times \mathrm{Conv}(U)$$

as an observation space, and

$$V_n = (\delta_{X_n}, \delta_{Y_n}, U(X_n, Y_n))$$

as the observation variable. Hence

$$V_n(x,y) = (\delta_x, \delta_y, U(x,y)).$$



Theorem 4.15 Assume that both players adopt a Markovian fictitious play strategy. Assume that for $i \in \{1,2\}$, $\beta_n^i = \beta^i(n)$ where $\beta^i$ is differentiable and verifies

$$0 \leq \dot\beta^i(t) \leq \frac{A^i}{t}$$

for some $A^i < 1/(2\max\{U^{i,\#}(y) : y \in E_{3-i}\})$.
For $v = (v^1, v^2, u) \in \Delta(E_1) \times \Delta(E_2) \times \mathrm{Conv}(U)$, let

$$C(v) = \{(\alpha, \beta, \gamma) \in \Sigma : \alpha \in C_1(v^2),\ \beta \in C_2(v^1),\ \gamma = U(\alpha, \beta)\}$$

where $C_1(v^2)$ is as in Theorem 4.14 and $C_2(v^1)$ is analogously defined for player 2. Then the limit set of $(v_n)$ is an internally chain transitive set of

$$\dot v \in -v + C(v).$$

Proof : Let $(M_n^i)$ denote the strategy of Player $i$. Let $\pi_n^i$, $\lambda_n^i$ be the invariant measure and spectral gap of $M_n^i$. On the state space $E_1 \times E_2$ the strategy of the pair of players is $M_n = M_n^1 \otimes M_n^2$, whose invariant measure is $\pi_n = \pi_n^1 \otimes \pi_n^2$ and whose spectral gap is $\lambda_n = \min(\lambda_n^1, \lambda_n^2)$. Thus Hypothesis 2.1 holds for $(M_n)$. The rest of the proof is similar to the proof of Theorem 4.14 and is left to the reader. QED

Corollary 4.16 (zero sum games) Suppose that $U^2 = -U^1$. Then under the assumption of Theorem 4.15, $(v_n^1, v_n^2)$ converges almost surely to the set of Nash equilibria

$$\{(v^1, v^2) : v^1 \in C_1(v^2),\ v^2 \in C_2(v^1)\},$$

and the average payoff $\frac{1}{n}\sum_{i=1}^{n} U^1(X_i, Y_i)$ converges almost surely to the value of the game

$$u^* = \max_{v^1} \min_{v^2} U^1(v^1, v^2) = \min_{v^2} \max_{v^1} U^1(v^1, v^2).$$

Proof : This follows from Theorem 2.6, Proposition 2.7 (ii) and the fact that the set $\{(v^1, v^2, u) : v^1 \in C_1(v^2),\ v^2 \in C_2(v^1),\ u = u^*\}$ is a global attractor of the differential inclusion, as proved in full generality by Benaïm, Hofbauer and Sorin (2005). QED

Corollary 4.17 (Potential games) Suppose that $U^2 = U^1$. Then under the assumption of Theorem 4.15, $(v_n^1, v_n^2)$ converges almost surely to a connected subset of the set of Nash equilibria

$$\{(v^1, v^2) : v^1 \in C_1(v^2),\ v^2 \in C_2(v^1)\}$$

on which $U^1$ is constant, and the average payoff $\frac{1}{n}\sum_{i=1}^{n} U^1(X_i, Y_i)$ converges almost surely towards this constant.

Proof : This follows from Theorem 2.6, Proposition 2.9 and the fact that $U^1 = U^2$ is a Lyapounov function of the differential inclusion. The proof of this latter point is given in (Benaïm, Hofbauer and Sorin, 2005, Theorem 5.5). It is similar to the proof of Corollary 4.11. QED
4.3.2 A remark on Hypothesis 2.2

We give here a simple example showing the necessity of Hypothesis 2.2. Consider the zero sum game where $E_1 = E_2 = \{0, 1\}$, $U^1 = -U^2$ and

$$U^1 = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.$$

Let $V_n = U^1(X_n, Y_n)$ be the payoff to player 1 at time $n$. One has

$$V_n(x) = U^1(x, 1)\,\nu_n + U^1(x, 0)(1 - \nu_n)$$

with $\nu_n = E(Y_{n+1} \mid \mathcal{F}_n)$.

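Suppose player 1 adopts the strategy given by

$$M_n = M = \begin{pmatrix} \epsilon & 1 - \epsilon \\ 1 - \epsilon & \epsilon \end{pmatrix}$$

for some $0 < \epsilon < 1$. Then $\pi_n = \pi$ with $\pi(0) = \pi(1) = 1/2$ and

$$\theta_n = \pi_n V_n = -1/2$$

regardless of the strategy played by 2.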

Suppose now that player 2 plays $Y_{n+1} = X_n$ for all $n \geq 1$. For $\epsilon \neq 1/2$ Hypothesis 2.2 is not verified and the prediction given by (a wrong application of) Theorem 2.6 fails since

$$v_n \to -\sum_{x \neq y} \pi(x) M(x,y) = -(1 - \epsilon).$$
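This counterexample is easy to simulate. The sketch below assumes that the payoff to player 1 is $-1$ when the two actions differ and $0$ otherwise (the reading of the payoff matrix that is consistent with $\theta_n = -1/2$ and with the limit $-(1-\epsilon)$); the function name, the seed and the run length are arbitrary choices made for the illustration.

```python
import random

def simulate_average_payoff(eps: float, n_steps: int, seed: int = 0) -> float:
    """Average payoff of player 1 when X follows M and player 2 copies X_n.

    ASSUMPTION: U^1(x, y) = -1 if x != y and 0 if x == y.
    """
    rng = random.Random(seed)
    U1 = [[0.0, -1.0], [-1.0, 0.0]]
    x = 0
    total = 0.0
    for _ in range(n_steps):
        y_next = x                                   # Y_{n+1} = X_n
        x_next = x if rng.random() < eps else 1 - x  # M(x, x) = eps
        total += U1[x_next][y_next]
        x = x_next
    return total / n_steps

avg = simulate_average_payoff(eps=0.2, n_steps=200_000)
```

With $\epsilon = 0.2$ the empirical average payoff settles near $-0.8 = -(1-\epsilon)$, not near the value $-1/2$ that a careless use of the theorem would suggest.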



5 Proof of Theorem 2.6



Let $F$ denote a set-valued function mapping each point $x \in \mathbb{R}^m$ to a set $F(x) \subset \mathbb{R}^m$. We call $F$ a standard set-valued map provided it verifies the three following conditions:

(i) $F$ is a closed set-valued map. That is,

$$\mathrm{Graph}(F) = \{(x,y) : y \in F(x)\}$$

is a closed subset of $\mathbb{R}^m \times \mathbb{R}^m$.

(ii) $F$ has nonempty compact convex values, meaning that $F(x)$ is a nonempty compact convex subset of $\mathbb{R}^m$ for all $x \in \mathbb{R}^m$.

(iii) There exists $c > 0$ such that for all $x \in \mathbb{R}^m$

$$\sup_{z \in F(x)} \|z\| \leq c(1 + \|x\|)$$

where $\|\cdot\|$ denotes any norm on $\mathbb{R}^m$.

Given a standard set-valued map $F$, set

$$F^\delta(u) = \{w \in \mathbb{R}^m : \exists v \in \mathbb{R}^m : d(u,v) < \delta,\ d(w, F(v)) < \delta\}.$$

The following proposition follows from the results of Benaim, Hofbauer and 
Sorin (2005). 

Proposition 5.1 Let $(x_n)$ and $(U_n)$ be discrete time processes living in $\mathbb{R}^m$ and $(\gamma_n)$ a sequence of nonnegative numbers. Let $(F_n)$ be a sequence of set-valued maps and let $F$ be a standard set-valued map. Assume that

(i) $x_{n+1} - x_n - \gamma_{n+1} U_{n+1} \in \gamma_{n+1} F_n(x_n)$,

(ii) $\sum_n \gamma_n = \infty$, $\lim_{n \to \infty} \gamma_n = 0$,

(iii) For all $T > 0$

$$\lim_{n \to \infty} \sup\left\{ \left\| \sum_{i=n}^{k-1} \gamma_{i+1} U_{i+1} \right\| : k = n+1, \dots, m(\tau_n + T) \right\} = 0,$$

where

$$\tau_n = \sum_{i=1}^{n} \gamma_i$$

and

$$m(t) = \sup\{k \geq 0 : t \geq \tau_k\}, \tag{21}$$

(iv) $\sup_n \|x_n\| = M < \infty$,

(v) For all $\delta > 0$ there exists $n_0$ such that

$$F_n(x_n) \subset F^\delta(x_n)$$

for all $n \geq n_0$.

Then the limit set of $(x_n)$ is an attractor free set of the dynamics induced by $F$.

Remark 5.2 This proposition is purely deterministic. If $(x_n)$ and $(U_n)$ are random processes, the assumptions have to be understood almost surely.

Remark 5.3 If condition (v) is strengthened to $F_n = F$, Proposition 5.1 follows from Proposition 1.3 and Theorem 4.3 of Benaïm, Hofbauer and Sorin (2005). Under the weaker hypothesis (v), it suffices to verify that the arguments given in the proof of Proposition 1.3 adapt verbatim.

With the notation of the preceding sections, write

$$v_{n+1} - v_n = \frac{1}{n+1}\left[-v_n + V_{n+1}\right] = \frac{1}{n+1}\left[-v_n + \theta_n + U_{n+1}\right]$$

where

$$U_{n+1} = V_{n+1} - \theta_n. \tag{22}$$

Hence, conditions (i), (ii) and (iv) of the previous proposition are satisfied with $F_n(u) = -u + C_n(u)$ and $\gamma_n = \frac{1}{n}$. Condition (v) follows from the next lemma.
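The recursion above is the usual running-average update with step size $1/(n+1)$, and with $\gamma_n = 1/n$ the time scale $\tau_n$ is the harmonic sum, so that $m(\tau_n + T)$ grows like $n e^T$. The following minimal sketch illustrates both facts for scalar observations; the function names and the numerical values are chosen for the illustration only.

```python
import bisect
import math
from typing import List, Sequence

def running_average(observations: Sequence[float]) -> float:
    """Iterate v_{n+1} = v_n + (V_{n+1} - v_n) / (n + 1), starting at n = 0."""
    v = 0.0  # the initial value is forgotten after the first step
    for n, V_next in enumerate(observations):
        v += (V_next - v) / (n + 1)
    return v

def harmonic_times(n_max: int) -> List[float]:
    """Return [tau_1, ..., tau_{n_max}] with tau_n = sum_{i <= n} 1/i."""
    taus, total = [], 0.0
    for i in range(1, n_max + 1):
        total += 1.0 / i
        taus.append(total)
    return taus

def m(t: float, taus: Sequence[float]) -> int:
    """m(t) = sup{k >= 0 : t >= tau_k}, with tau_0 = 0, valid while t < taus[-1]."""
    return bisect.bisect_right(taus, t)

taus = harmonic_times(10_000)
n, T = 1000, 1.0
ratio = m(taus[n - 1] + T, taus) / (n * math.exp(T))
```

For vector-valued observations the same update applies coordinate by coordinate; in the example the ratio $m(\tau_n + T)/(n e^T)$ is close to one.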

Lemma 5.4 Let $C$ be adapted. For $u \in \Sigma$ and $\delta > 0$ set

$$C^\delta(u) = \{w \in \Sigma : \exists v \in \Sigma : d(u,v) < \delta,\ d(w, C(v)) < \delta\}.$$

Then for all $\delta > 0$ there exists $n_0$ such that

$$C_n(u) \subset C^\delta(u)$$

for all $n \geq n_0$ and $u \in p(C_n)$.

Proof : Let $T_n = p(C_n)$. Assume to the contrary that there exist sequences $u_{n_k} \in T_{n_k}$ and $v_{n_k} \in C_{n_k}(u_{n_k})$ such that $n_k \to \infty$ and $v_{n_k} \notin C^\delta(u_{n_k})$. By compactness we may assume that $u_{n_k} \to u$, $v_{n_k} \to v \in C(u)$. Hence for $n_k$ large enough $d(u_{n_k}, u) < \delta$ and $d(v_{n_k}, v) < \delta$, proving that $v_{n_k} \in C^\delta(u_{n_k})$, a contradiction. QED

To conclude the proof of Theorem 2.6 it remains to verify condition (iii) of Proposition 5.1.

Lemma 5.5 Under Hypotheses 2.1 and 2.2 the sequence $(U_n)$ defined by (22) verifies hypothesis (iii) of Proposition 5.1.

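Proof : Set $\frac{1}{n+1} U_{n+1} = \epsilon^0_{n+1} + \epsilon_{n+1}$ with

$$\epsilon^0_{n+1} = \frac{1}{n+1}\left(V_{n+1} - V_n(X_{n+1})\right)$$

and

$$\epsilon_{n+1} = \frac{1}{n+1}\left(V_n(X_{n+1}) - \Pi_n V_n\right) = \frac{1}{n+1}(Q_n - M_n Q_n) V_n(X_{n+1}),$$

where the last equality follows from the definition of $Q_n$. Now, write $\epsilon_{n+1} = \sum_{i=1}^{5} \epsilon^i_{n+1}$, where

$$\epsilon^1_{n+1} = \frac{1}{n+1}\left[Q_n V_n(X_{n+1}) - M_n Q_n V_n(X_n)\right],$$

$$\epsilon^2_{n+1} = \frac{1}{n+1} M_n Q_n V_n(X_n) - \frac{1}{n} M_n Q_n V_n(X_n),$$

$$\epsilon^3_{n+1} = \frac{1}{n} M_n Q_n V_n(X_n) - \frac{1}{n+1} M_{n+1} Q_{n+1} V_{n+1}(X_{n+1}),$$

$$\epsilon^4_{n+1} = \frac{1}{n+1} M_{n+1} Q_{n+1} (V_{n+1} - V_n)(X_{n+1}),$$

$$\epsilon^5_{n+1} = \frac{1}{n+1}\left[M_{n+1} Q_{n+1} V_n(X_{n+1}) - M_n Q_n V_n(X_{n+1})\right].$$

For $i = 0, \dots, 5$, let

$$\epsilon^i_n(T) = \sup\left\{ \left\| \sum_{j=n}^{k-1} \epsilon^i_{j+1} \right\| : k = n+1, \dots, m(\tau_n + T) \right\}.$$

Since $\Sigma$ is compact there exists a finite constant $R$ such that

$$\|V_n\| + \sup_x \|V_n(x)\| \leq R.$$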



The sequence $(\epsilon^0_n)$ is a martingale difference with

$$E\left(\|\epsilon^0_{n+1}\|^2 \mid \mathcal{F}_n\right) \leq \frac{R^2}{(n+1)^2}.$$

Therefore, by Doob's convergence theorem for $L^2$ martingales, $\lim_{n \to \infty} \epsilon^0_n(T) = 0$ a.s.

The sequence $(\epsilon^1_n)$ is a martingale difference with $\|\epsilon^1_{n+1}\| \leq R|Q_n|/(n+1)$. Thus by a classical application of the exponential martingale inequality (inequality (18) in Benaïm (1999)) we have for all positive $\alpha$,

$$P(\epsilon^1_n(T) \geq \alpha) \leq c \exp\left( \frac{-\alpha^2}{c \sum_{i=n}^{m(\tau_n + T)} R^2 |Q_i|^2 / i^2} \right)$$



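for some positive constant $c$. By Hypothesis 2.1, for any $\varepsilon > 0$ and $n$ large enough (note that $(n-1)e^T \leq m(\tau_n + T) \leq n e^T$)

$$\sum_{i=n}^{m(\tau_n + T)} \frac{|Q_i|^2}{i^2} \leq \varepsilon \sum_{i=n}^{m(\tau_n + T)} \frac{1}{i \log i} \leq \varepsilon\, \frac{T+1}{\log n}.$$

Thus

$$\sum_n P(\epsilon^1_n(T) \geq \alpha) < \infty$$

and $\lim_{n \to \infty} \epsilon^1_n(T) = 0$ a.s. by the Borel-Cantelli Lemma.

For $n+1 \leq k \leq m(\tau_n + T)$,

$$\sum_{j=n}^{k-1} \epsilon^2_{j+1} = \sum_{j=n}^{k-1} \left(\frac{1}{j+1} - \frac{1}{j}\right) M_j Q_j V_j(X_j).$$

Thus

$$\epsilon^2_n(T) \leq R \sum_{j=n}^{m(\tau_n + T)} \frac{|Q_j|}{j^2}.$$

By Hypothesis 2.1 this goes to zero a.s. when $n \to \infty$.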
For $n+1 \leq k \leq m(\tau_n + T)$,

$$\sum_{j=n}^{k-1} \epsilon^3_{j+1} = \frac{1}{n} M_n Q_n V_n(X_n) - \frac{1}{k} M_k Q_k V_k(X_k),$$

so that

$$\epsilon^3_n(T) \leq 2R \sup_{i \geq n} \frac{1}{i} |Q_i|$$

and $\epsilon^3_n(T) \to 0$ a.s. as $n \to \infty$ by Hypothesis 2.1.

The term $\epsilon^4_n(T)$ is dominated by

$$(T+1) \sup_{i \geq n} \sup_x \left| M_{i+1} Q_{i+1} (V_{i+1} - V_i)(x) \right|$$

which converges a.s. towards $0$ as $n \to \infty$ by Hypothesis 2.2.

Finally, since $M_n Q_n = Q_n + \Pi_n - I$,

$$(n+1)\,\epsilon^5_{n+1} = (Q_{n+1} - Q_n) V_n(X_{n+1}) + (\Pi_{n+1} - \Pi_n) V_n.$$

Hence

$$\epsilon^5_n(T) \leq R(T+1) \sup_{i \geq n} \left( |Q_{i+1} - Q_i| + |\Pi_{i+1} - \Pi_i| \right) \to 0$$

a.s. by Hypothesis 2.1. This completes the proof of (iii). QED